Fix emulation accuracy: resolution, 68K instructions, events, RAM init, logging #119

Merged
JoeMatt merged 83 commits into libretro:master from Provenance-Emu:fix/general-emulation-fixes
May 1, 2026

Conversation

@JoeMatt
Collaborator

@JoeMatt JoeMatt commented Apr 23, 2026

Summary

  • TOM resolution pipeline: Use HDB1/HDE/VDB/VDE registers with pwidth for dynamic resolution instead of hardcoded constants — fixes black screen / cyan bar on many ROMs (e.g. 240p test suite)
  • 68K instruction emulation: Emulate MULL/DIVL (68020+) via IllegalOpcode trap; fix BSR.L absolute address for Atari aln linker — needed by modern homebrew
  • Event system: Fix GetTimeToNextEvent() reading uninitialized slot[0] time without validity check, causing periodic execution bursts and visible stuttering; zero eventTime on init
  • RAM init: Skip randomization over loaded executable region for ABS/COFF files loaded at $4000+
  • JERRY timers: Fix PIT2 using PIT1's prescaler/divider; disable JERRY trace spam (was on every 44kHz interrupt)
  • Libretro logging: Add log.h framework (LOG_DBG/INF/WRN/ERR) using retro_log_printf_t so verbosity is toggleable in RetroArch UI; convert fprintf(stderr) calls in DAC, GPU, JERRY, TOM
  • GPU: Add trace debug guard macro (disabled by default), GPUDumpState() diagnostic function

Test plan

  • Builds clean on macOS (Clang) and Linux (GCC)
  • 240p test suite boots and renders correctly (was black screen on master)
  • Scrolling bars test runs without periodic stuttering (event system fix)
  • Regression test against Tempest 2000 and other known-working titles
  • CI passes

🤖 Generated with Claude Code

JoeMatt and others added 2 commits April 23, 2026 19:42
General emulation fixes extracted from the CD development branch,
with no CD-specific code included.

- TOM: Use actual HDB1/HDE/VDB/VDE registers for resolution calculation
  instead of hardcoded visible area constants. Fixes games that set
  non-standard display windows (240p test suite, Doom).
- 68K: Emulate 68020 MULL/DIVL instructions via IllegalOpcode trap,
  needed for m68k-atari-mint-gcc / Removers Library toolchain games.
- 68K: Fix BSR.L (opcode $61FF) for Atari 'aln' linker which writes
  absolute addresses instead of PC-relative displacements.
- RAM: Skip randomization over loaded executable region during reset,
  fixing RAM-loaded ABS/COFF files that were destroyed by JaguarReset().
- Logging: Add libretro log interface (src/log.h) with LOG_DBG/INF/WRN/ERR
  macros routed through retro_log_printf_t, toggleable in RetroArch UI.
- JERRY: Disable JERRY_TRACE_DEBUG (was fprintf on every 44kHz interrupt).
- GPU: Add GPU_TRACE_DEBUG guard for trace output.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
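
The RAM-init item above (skip randomization over the loaded executable region) might look roughly like the following. This is a minimal sketch, not the code in src/jaguar.c: the function name `randomize_ram_skipping` and the half-open range parameters `loaded_start` / `loaded_end` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch: fill RAM with pseudo-random bytes on reset, but leave
 * the half-open range [loaded_start, loaded_end) untouched so an executable
 * that was loaded into RAM survives the reset. */
static void randomize_ram_skipping(uint8_t *ram, size_t ram_size,
                                   size_t loaded_start, size_t loaded_end)
{
    size_t i;

    for (i = 0; i < ram_size; i++) {
        if (i >= loaded_start && i < loaded_end)
            continue;                   /* preserve the loaded executable */
        ram[i] = (uint8_t)(rand() & 0xFF);
    }
}
```

Passing an empty range (`loaded_start == loaded_end`) randomizes everything, which is the behaviour a cartridge boot with nothing loaded into RAM would want.
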
GetTimeToNextEvent() was reading slot[0].eventTime unconditionally as the
initial minimum, even when slot[0] was invalid—using uninitialized garbage
values. This caused periodic execution bursts where the 68K would run for
huge timeslices, producing visible stuttering.

Start with time=1e30 sentinel and iterate from slot 0 with validity checks.
Also zero eventTime in InitializeEventList() to prevent stale values.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
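
The sentinel-plus-validity-check approach described in this commit can be sketched as follows. The struct layout, the list size of 32, and the `...Sketch` function names are assumptions for illustration; only the 1e30 sentinel, the per-slot validity check, and zeroing `eventTime` on init come from the commit message.

```c
#include <stdbool.h>

#define EVENT_LIST_SIZE 32   /* assumed size, for illustration only */

struct Event {
    bool   valid;
    double eventTime;
};

static struct Event eventList[EVENT_LIST_SIZE];

/* Mark every slot invalid and zero its time so no stale value survives. */
static void InitializeEventListSketch(void)
{
    int i;

    for (i = 0; i < EVENT_LIST_SIZE; i++) {
        eventList[i].valid = false;
        eventList[i].eventTime = 0.0;
    }
}

/* Return the smallest eventTime among valid slots, or the 1e30 sentinel
 * when no slot is valid. Slot 0 gets the same validity check as the rest. */
static double GetTimeToNextEventSketch(void)
{
    double time = 1e30;
    int i;

    for (i = 0; i < EVENT_LIST_SIZE; i++) {
        if (eventList[i].valid && eventList[i].eventTime < time)
            time = eventList[i].eventTime;
    }
    return time;
}
```

With this shape, a stale value left in an invalid slot 0 can no longer become the minimum and stretch the 68K timeslice.
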
Contributor

Copilot AI left a comment

Pull request overview

Improves Virtual Jaguar libretro core emulation accuracy and debuggability by fixing video timing/resolution handling, 68K instruction edge-cases, event scheduling stability, RAM initialization for RAM-loaded executables, and adding a configurable logging framework.

Changes:

  • Derive TOM render width/height from HDB/HDE/VDB/VDE + PWIDTH rather than fixed constants; add TOM IRQ latch/control helpers.
  • Emulate 68020+ MULL/DIVL via IllegalOpcode handler and apply a Jaguar-specific BSR.L/aln absolute-target quirk.
  • Stabilize the event scheduler init/selection logic; preserve RAM-loaded executable regions across reset; add libretro-backed LOG_* macros and reduce trace spam.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/tom.c | Dynamic resolution calculation and new TOM IRQ latch/control helpers |
| src/m68000/m68kinterface.c | Illegal opcode handling extended to emulate 68020 MULL/DIVL |
| src/m68000/cpuemu.c | Jaguar-specific BSR.L absolute-target handling for aln-built binaries |
| src/log.h | New logging wrapper (LOG_DBG/INF/WRN/ERR) using libretro log callback with stderr fallback |
| src/jerry.c | PIT2 prescaler/divider fix and trace logging guarded by compile-time flag |
| src/jaguar.h | Exposes loaded-RAM region bounds for reset logic |
| src/jaguar.c | Reset RAM randomization skips RAM-loaded executable region |
| src/gpu.h | Declares new GPU diagnostics helpers (GPUIsRunning/GPUDumpState) |
| src/gpu.c | Adds logging include and compile-time GPU trace macro |
| src/file.c | Tracks RAM-loaded executable region bounds when loading ABS/COFF/JagServer formats |
| src/event.c | Initializes eventTime fields and avoids selecting invalid slot[0] by default |
| src/dac.c | Minor control-flow brace adjustment in DACWriteWord |
| libretro.c | Integrates new logging callback and reloads RAM-loaded executables after reset |
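
As a rough illustration of the logging wrapper described for src/log.h, the sketch below routes `LOG_*` macros through a libretro-style callback with a stderr fallback. It is not the file from this PR: the enum and typedef are local mirrors of the libretro.h declarations so the block compiles on its own, the global `vj_log_cb` is a hypothetical name, and the variadic macros assume a C99 compiler.

```c
#include <stdarg.h>
#include <stdio.h>

/* Local mirror of the libretro log types so the sketch compiles on its own;
 * a real core would include libretro.h instead. */
enum retro_log_level {
    RETRO_LOG_DEBUG = 0,
    RETRO_LOG_INFO,
    RETRO_LOG_WARN,
    RETRO_LOG_ERROR
};
typedef void (*retro_log_printf_t)(enum retro_log_level level,
                                   const char *fmt, ...);

/* Fallback used until the frontend supplies a log callback. */
static void vj_log_stderr(enum retro_log_level level, const char *fmt, ...)
{
    va_list ap;

    (void)level;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
}

/* Hypothetical global; the core would overwrite it with the callback
 * obtained from the frontend. */
static retro_log_printf_t vj_log_cb = vj_log_stderr;

#define LOG_DBG(...) vj_log_cb(RETRO_LOG_DEBUG, __VA_ARGS__)
#define LOG_INF(...) vj_log_cb(RETRO_LOG_INFO,  __VA_ARGS__)
#define LOG_WRN(...) vj_log_cb(RETRO_LOG_WARN,  __VA_ARGS__)
#define LOG_ERR(...) vj_log_cb(RETRO_LOG_ERROR, __VA_ARGS__)
```

Because every call carries a level, the frontend can filter by verbosity without the core being recompiled, which is the point of replacing bare `fprintf(stderr, ...)` calls.
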

Comment thread src/tom.c Outdated
Comment thread src/m68000/m68kinterface.c Outdated
Comment thread src/jaguar.c Outdated
Comment thread src/tom/gpu.h
JoeMatt and others added 7 commits April 23, 2026 21:13
… GPU stubs

- Fix DIVL divide-by-zero exception: advance PC by full instruction length
  (4 + extra) instead of partial (2 + extra), matching the non-exception path
- Fix strict-aliasing violation in JaguarReset RAM randomization: use SET32()
  macro instead of uint32_t* cast into uint8_t array
- Guard TOM width calculation against HDE < HDB1 underflow: validate
  dispEnd > dispStart before computing width, preventing uint16_t wrap
- Add GPUIsRunning() and GPUDumpState() implementations to match gpu.h
  declarations

Co-Authored-By: Claude Opus 4.6 <[email protected]>
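
The HDE < HDB1 underflow guard from this commit could be sketched like this. The helper name, the fallback parameter, and the `(end - start) / pwidth` formula are assumptions made for the example; the commit message only states that `dispEnd > dispStart` is validated before the width is computed.

```c
#include <stdint.h>

/* Hypothetical helper: derive a visible width in pixels from the horizontal
 * display begin/end register values and the pixel-width divisor. Returns
 * `fallback` when the registers are inconsistent. */
static uint32_t tom_width_sketch(uint16_t dispStart, uint16_t dispEnd,
                                 uint16_t pwidth, uint32_t fallback)
{
    /* Without this check, storing dispEnd - dispStart in a uint16_t would
     * wrap to a huge width whenever dispEnd < dispStart. */
    if (pwidth == 0 || dispEnd <= dispStart)
        return fallback;
    return (uint32_t)(dispEnd - dispStart) / pwidth;
}
```
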
The Doom res hack was a CRY-only special case that doubled pixels when
pwidth=8 and the user enabled a core option. This replaces it with
correct pwidth-aware rendering in all 5 scanline renderers.

When pwidth >= 8, each line buffer pixel is replicated (pwidth/4) times
in the backbuffer, and TOMGetVideoModeWidth returns the scaled display
width. This handles Doom (pwidth=8) and any other game using wide pixel
modes, without requiring a user-facing hack option.

Removes: doom_res_hack variable, virtualjaguar_doom_res_hack core option.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
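
The pixel replication rule described above (each line buffer pixel written `pwidth/4` times when `pwidth >= 8`) can be illustrated with a small sketch. The function name, the 32-bit pixel type, and the capacity check are assumptions; the five real scanline renderers are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: copy one scanline into the backbuffer, writing each
 * source pixel (pwidth / 4) times when pwidth >= 8, once otherwise.
 * Returns the number of destination pixels written. */
static size_t replicate_scanline(const uint32_t *src, size_t src_count,
                                 uint32_t *dst, size_t dst_capacity,
                                 unsigned pwidth)
{
    size_t scale = (pwidth >= 8) ? (size_t)(pwidth / 4) : 1;
    size_t written = 0;
    size_t i, r;

    for (i = 0; i < src_count; i++) {
        for (r = 0; r < scale; r++) {
            if (written == dst_capacity)
                return written;     /* never write past the backbuffer row */
            dst[written++] = src[i];
        }
    }
    return written;
}
```

For `pwidth = 8` every source pixel is doubled, which matches the Doom case mentioned in the commit.
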
Two fixes for DSP emulation accuracy:

1. Run DSPExec() in JaguarExecuteNew() main loop alongside GPU.
   Previously DSP only ran via SoundCallback(), starving games
   that use the DSP for non-audio work (e.g. WMCJ handshake).

2. Dispatch pending interrupts immediately when a flags write
   enables INT_ENA while the corresponding INT_LAT is pending.
   Real hardware fires the IRQ within one cycle of the enable;
   without this, games that rely on CPU-to-DSP interrupts hang.

Also removes temporary debug fprintf instrumentation and adds
DSPGetRAM() accessor for the test harness.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Two corrections to the DSP execution fix:

1. Move DSPHandleIRQsNP() call AFTER CINT latch clearing. When an
   ISR writes to dsp_flags to clear IMASK and CINT0 simultaneously,
   dispatching before CINT clearing caused re-entrant interrupts
   (INT_LAT0 still pending when dispatch checked). Moving dispatch
   after CINT ensures the latch is cleared before checking.

2. Revert DSPExec() from JaguarExecuteNew(). SoundCallback() in
   dac.c already runs the DSP each frame. Running it in both places
   gave the DSP double execution time, causing audio/video glitches.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
… constants

The DSPGO auto-ack threshold of 64 was too aggressive — normal gameplay
status checks accumulated across frames and killed legitimate DSP sound
programs after ~2 seconds, causing BattleSphere and WMCJ to go black
after reaching menus.  Raise to 8192 (tight boot-time poll loops still
trigger within one frame) and reset the counter on DSP_CTRL writes.

Narrow DSP RAM auto-ack from 5KB (offset >= 0x9D0) to the 16-byte BIOS
sound command area ($F1B9D0-$F1B9DF) to stop destroying non-command game
data.

Also:
- Replace 30+ inline magic numbers in HLE init and DSP with named #defines
- Fix DSP IRQ dispatch: only re-dispatch from external callers (M68K/GPU),
  not when DSP itself writes flags during ISR return
- Fix DSP INT_LAT5 extraction bit shift (>>11 not >>10)
- Fix DSP sat32s to use sign-extended 40-bit accumulator
- Add TOM video register defaults (HS, HVS, HDB2, VEB, VEE, HEQ, BG)
- Add TOM width/height bounds checking to prevent buffer overflows
- Zero-fill RAM and DSP RAM in HLE mode (BIOS clears these; games assume it)
- Add test_rom_smoke batch tester and WMCJ/HLE boot debug harnesses

Tested: BattleSphere and WMCJ maintain video output through 1800 frames
(30 seconds) in HLE mode. Full ROM suite: 14 OK, 2 NO_VIDEO (expected),
0 crashes, no regressions.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
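
For the "sat32s to use sign-extended 40-bit accumulator" item, one way to express the idea is shown below. This is a sketch under the assumption that the accumulator is held in the low 40 bits of a 64-bit integer; the function names are hypothetical and the PR's actual dsp.c code is not shown on this page.

```c
#include <stdint.h>

/* Sign-extend a 40-bit two's-complement value held in the low 40 bits. */
static int64_t sign_extend_40(uint64_t acc)
{
    acc &= 0xFFFFFFFFFFULL;                 /* keep only bits 0..39 */
    if (acc & 0x8000000000ULL)              /* bit 39 is the sign bit */
        return (int64_t)acc - (int64_t)0x10000000000LL;
    return (int64_t)acc;
}

/* Saturate the sign-extended 40-bit value to the signed 32-bit range. */
static int32_t sat32s_sketch(uint64_t acc40)
{
    int64_t v = sign_extend_40(acc40);

    if (v > INT32_MAX)
        return INT32_MAX;
    if (v < INT32_MIN)
        return INT32_MIN;
    return (int32_t)v;
}
```

Skipping the sign extension would make every negative accumulator look like a large positive number and clamp it to `INT32_MAX`.
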
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 27 changed files in this pull request and generated 4 comments.


Comment thread test/test_m68k_ops.c Outdated
Comment thread test/test_gpu_ops.c Outdated
Comment thread src/log.h Outdated
Comment thread test/test_rom_smoke.c
- Revert DSP IRQ dispatch `who != DSP` guard that broke BIOS mode
  by preventing the DSP from self-dispatching interrupts during ISR
  return. WMCJ with BIOS dropped from 8.1M to 4.9M pixels at 900
  frames; now restored to 8.1M.
- Export DSP/GPU/memory symbols in link.T so Linux CI can dlsym
  them for DSP instruction set tests (was "Missing: DSPReset").
- Remove `inline` from vj_log_stderr in log.h (MSVC compatibility).
- Add pack assertion in test_gpu_ops.c (was always-pass).
- Fix LEA test comment inconsistency in test_m68k_ops.c.
- Add <strings.h> for strcasecmp in test_rom_smoke.c.
- Remove tomWidth bounds guard from TOMExecHalfline (already
  clamped in TOMGetVideoModeWidth).

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 27 out of 28 changed files in this pull request and generated 4 comments.


Comment thread src/core/jaguar.c Outdated
Comment thread src/dsp.c Outdated
Comment thread link.T Outdated
Comment on lines +4 to +18
DSP*;
dsp_*;
m68k_*;
jaguarMainRAM;
jaguarMainROM;
jagMemSpace;
pcQueue;
pcQPtr;
a6Queue;
d0Queue;
GPUReadLong;
GPUWriteLong;
JERRY*;
TOM*;
tomRam8;
Copilot AI Apr 26, 2026

The linker version script now exports a wide set of internal/testing symbols (e.g., DSP*, dsp_*, m68k_*, pcQueue, tomRam8, etc.). This expands the shared-library ABI surface area and makes it harder to change/refactor internals later (and may unintentionally allow external consumers to depend on non-API symbols).

If these exports are only needed for local/CI test harnesses, consider gating them behind a build-time define (separate version script for test builds) or limiting exports to the minimum set required.

Suggested change
DSP*;
dsp_*;
m68k_*;
jaguarMainRAM;
jaguarMainROM;
jagMemSpace;
pcQueue;
pcQPtr;
a6Queue;
d0Queue;
GPUReadLong;
GPUWriteLong;
JERRY*;
TOM*;
tomRam8;

Comment thread src/m68000/m68kinterface.c Outdated
@JoeMatt JoeMatt linked an issue Apr 26, 2026 that may be closed by this pull request
JoeMatt and others added 3 commits April 26, 2026 18:45
… DSP IRQ

- Export GPU*/gpu_* symbols in link.T so test_gpu_ops can dlsym GPUReset
  on Linux (was "Missing: GPUReset" in CI)
- Fix PACK instruction test: expected value was 0x01E0 but correct result
  is 0x08E0 (shift arithmetic error in test comment)
- Update regression baselines for pwidth resolution changes
- Restore DSP INT_LAT5 extraction fix (>>10 → >>11) that was accidentally
  reverted in 3e0a4e7 — >>10 reads VERSION[3] instead of INT_LAT5 (bit 16)
- Remove DSP debug trace buffer (dsp_ctrl_log struct/array, 21 lines)
- Add SRAM loading, input injection, and per-frame video tracking to
  test_rom_smoke for interactive regression testing
- Add subsystem init, timeline, IRQ cascade, and audio pipeline test
  harnesses for BIOS vs HLE comparison

Co-Authored-By: Claude Opus 4.6 <[email protected]>
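
The INT_LAT5 shift fix is easy to check with a few lines of arithmetic. The sketch below assumes, as the commit text implies, that INT_LAT5 sits at bit 16 of the control word and that the extracted latch belongs at bit 5 of the internal mask; the function names are hypothetical.

```c
#include <stdint.h>

#define CTRL_INT_LAT5_BIT  16   /* position in the control word (per commit) */
#define LATCH_INT_LAT5_BIT 5    /* assumed position in the internal latch mask */

/* Correct extraction: 16 - 5 = 11, so bit 16 lands on bit 5. */
static uint32_t extract_int_lat5(uint32_t ctrl)
{
    return (ctrl >> (CTRL_INT_LAT5_BIT - LATCH_INT_LAT5_BIT))
           & (1u << LATCH_INT_LAT5_BIT);
}

/* Buggy extraction: a shift of 10 moves bit 15 onto bit 5 instead. */
static uint32_t extract_int_lat5_buggy(uint32_t ctrl)
{
    return (ctrl >> 10) & (1u << LATCH_INT_LAT5_BIT);
}
```

With a control word that has only bit 16 set, the correct version yields 0x20 and the buggy one yields 0, so a pending latch would be silently lost.
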
Remove immediate DSPHandleIRQsNP() call from FLAGS write handler —
it could dispatch a new interrupt before the current ISR's return
instruction executes, corrupting the DSP stack. The original deferred
IMASKCleared mechanism at the top of DSPExec handles this correctly.

Add periodic audio diagnostic logging (every 60 frames) in
SoundCallback to help diagnose the BIOS audio silence regression.
Logs DSP control/flags, I2S config, LTXD values, and whether
samples are non-zero.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Signed-off-by: Joseph Mattiello <[email protected]>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 35 changed files in this pull request and generated 4 comments.


Comment thread test/test_rom_smoke.c
Comment thread test/test_audio_diag.c Outdated
Comment thread src/dac.c Outdated
Comment thread src/m68000/m68kinterface.c
Snapshotted DSP RAM at frame 5 and 10 with BIOS and HLE, then binary-
searched the embedded jaguarBootROM blob for the matching prefix.

Key findings:
- The real Jaguar BIOS pre-loads a 1992-byte DSP audio engine from
  jaguarBootROM[0x214E..0x2916] into DSP RAM offset 0 and sets DSPGO=1.
- Engine prefix: 98 00 B0 30 00 F1 D0 00 ... (MOVEI #$F1D000, R0 —
  wavetable ROM pointer — then NOP slots).
- But copying this engine into DSP RAM during HLE init does NOT fix
  Skyhammer or Iron Soldier 2.  Both titles overwrite the engine with
  their own DSP code by ~frame 30, so having it pre-loaded is moot.
- Atari Karts (negative control) is unaffected by the engine copy.
- Skyhammer's HLE-mode DSP RAM at frame 175 is dramatically different
  from its BIOS-mode DSP RAM at frame 175 (~95% divergence across the
  audio engine area).  The 68K code is reading something early-boot
  to choose which DSP audio routine to load, and HLE provides a
  different value than BIOS.

Most plausible remaining hypothesis: Skyhammer JSRs through a BIOS-
installed exception vector (the BIOS leaves handler addresses like
06066xxx, 06067xxx; HLE installs simple RTE stubs at different
addresses).  If that's the audio-init path, our RTE stub returns
immediately and the BIOS audio-init routine never runs.

This commit is documentation only — the engine copy was tried and
reverted because it did not fix the bug.  The investigation hands
off to the next person with a concrete next step rather than an
open-ended DSP archaeology task.
@JoeMatt JoeMatt requested a review from Copilot April 30, 2026 03:56
Contributor

Copilot AI left a comment

Copilot wasn't able to review this pull request because it exceeds the maximum number of lines (20,000). Try reducing the number of changed lines and requesting a review from Copilot again.

JoeMatt added 9 commits April 30, 2026 00:35
Wolf3D HLE black-screens on iOS RetroArch (Metal) where the same dylib
renders correctly on macOS arm64.  Diagnostic instrumentation
(DEBUG_PRESENTATION compile flag) showed the geometry oscillates 326↔260
every couple frames as the cart switches between full-width wallpaper
and the gameplay viewport.  The old SET_GEOMETRY-after-submit ordering
left a one-frame window where TOM rendered the new tomWidth (e.g. 326)
into rows still spaced at the previous screenPitch (e.g. 320), which
overlaps row tails into the next row's start.  iOS Metal additionally
re-allocates the source texture on SET_GEOMETRY and can drop frames
that arrive between submission and reallocation.

Move the geometry-change check to the START of retro_run, before
JaguarExecuteNew/video_cb.  Now SET_GEOMETRY + JaguarSetScreenPitch
fire before TOM's scanline renderer is invoked, so tomWidth and
screenPitch stay in sync for the entire frame and the frontend's
texture allocation matches the buffer we hand it.

Test 10g in test_hle_bios.c was asserting the OLD ordering (frame
submitted at the previous pitch); update it to the new (correct) order
and add a new assertion that no spurious SET_GEOMETRY fires when the
width is stable.  Headless A/B against libretro/master and against this
branch's prior tip both still produce the expected gameplay screenshot
at frame 1800.

Also wire a DEBUG_PRESENTATION compile flag (make DEBUG_PRESENTATION=1)
into Makefile + libretro.c that enables periodic LOG_INF dumps of
tomWidth/tomHeight/screenPitch/sample-pixels/ltxd-rtxd/DSPIsRunning
from retro_run.  Costs nothing at default (ifdef'd out) and gives the
next person a one-step diagnostic when a frontend reports
black/garbage frames.
Cluster investigation across the still-broken cart titles produced two
sub-agent reports plus a manually-driven boot-progression sweep. Headline:

1. SCLK/SMODE divergence is real and now fixed.
   The HLE init was writing SCLK=0x08 (~46 kHz I2S) and SMODE=0x01
   (INTERNAL only).  The real BIOS audio engine ends up at SCLK=0x13
   (~20 kHz) and SMODE=0x15 (INTERNAL + WSEN + FALLING).  Update HLE
   defaults to match.  Test 9b in test_hle_bios.c was asserting the
   old values; updated to assert the new (BIOS-accurate) values.

2. SCLK/SMODE alone does NOT fix Skyhammer / IS2 audio clipping.
   Saturation density essentially unchanged (25.4% / 20.6%).  Atari
   Karts negative control still produces clean audio.  The narrower
   diagnosis from the agent's snapshots: Skyhammer's 68K is stuck at
   0x008022EE in a DBF delay loop for HLE frames 1-60, while BIOS
   reaches mainloop 0x000059B0 by frame 10.  The DBF loop is waiting
   on something that does not fire under HLE timing — likely an I2S
   sample count or DSP completion.

3. 4 of 5 hang/crash titles fail equally with BIOS and HLE.
   Boot timeline (per-frame PC, framebuffer non-black count, DSP state)
   for Hyper Force / Iron Soldier / Hover Strike / Ruiner Pinball /
   Super Burnout in BOTH bios=enabled and bios=disabled shows
   identical or near-identical end-state (same stuck PC, same pixel
   counts, same DSP state).  These are real emulation bugs, not
   HLE-init issues.  Real BIOS does not fix them either.
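
The approximate sample rates quoted for the two SCLK values in item 1 can be reproduced with a short calculation. This sketch assumes the commonly used relation of one sample per 32 serial-clock periods, with the serial clock at the system clock divided by `2 * (SCLK + 1)`, and an NTSC system clock of 26,590,906 Hz; neither the constant nor the function name appears in the commit itself.

```c
#include <stdint.h>

#define RISC_CLOCK_NTSC_HZ 26590906u   /* assumed NTSC system clock */

/* Assumed relation: rate = clock / (32 * 2 * (SCLK + 1)). */
static uint32_t i2s_sample_rate_hz(uint32_t clock_hz, uint32_t sclk)
{
    return clock_hz / (32u * (2u * (sclk + 1u)));
}
```

Under those assumptions `SCLK=0x08` gives about 46.2 kHz and `SCLK=0x13` gives about 20.8 kHz, in line with the "~46 kHz" and "~20 kHz" figures above.
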

Add test/tools/flicker_detect.c — a sliding-window per-pixel temporal
stddev computer that produces a flicker-score timeline, histogram,
and downsampled spatial flicker map.  Atari Karts baseline measures
mean=4.81; NBA Jam TE 12.6, Towers II 12.4, Tempest 2000 6.2 — gives
the in-game flicker bugs an objective metric so a future fix has a
regression watcher.  Not wired into make test yet (needs ROMs);
runnable manually with --frames N --press-X A-B input scheduling.
Skipped in scripts/c89-lint.sh as a diagnostic tool.

Update docs/emulation-bug-hunt-todos.md with the cross-cutting finding
and per-title narrowest-clue table.  Raw investigation data files
(snapshots, diffs, screenshots) are left in /tmp/ for follow-up.
Skyhammer:
- The agent's 'stuck at 0x008022EE' was an early-frame snapshot of a
  long delay loop (DBF on D0=$00FFFFFF, ~167M cycles ~6 s).
  Boot_timeline at later frames (60/300/600/1200/3600/7200) shows PC
  progresses through multiple cart code regions.
- So Skyhammer's main 68K thread is NOT stuck. Audio clipping is a
  DSP/I2S issue, not a cart-side hang.
- Cart-disassembly of the delay loop is not the right approach.

Ruiner Pinball:
- Stuck PC 0x809CAE in both BIOS and HLE. Disassembling around it
  reveals a routine at 0x9CA0: CMPI.L #0,$5B18; BEQ +6; JMP $802000.
  Plus a routine at 0x2248 that JSRs through a function pointer at
  $402C, then conditionally writes MOVE.L #1, $5B18 only if RAM[$4068]
  has specific bits.
- Per-frame probe (frames 1, 5, 30, 60, 120, 300, 600) for both
  BIOS-mode and HLE-mode shows: $402C and $4068 stay 0 forever
  in both modes, while $5B18 cycles through non-zero values.
- So Ruiner is blocked on something that should populate $402C and
  $4068. Real BIOS does not fix it -- implies a cart-side init
  precondition (likely an interrupt that should fire early in boot
  but doesn't) is failing.
- v2.3.0 work: trace the cart's interrupt-handler installation path
  and identify what precondition needs to hold for the init routine
  to run.

Probe tool source at /tmp/probe.c (not committed); reproducible from
the description in the bug-hunt-todos entry.
Wolfenstein 3D HLE is completely silent (RMS=0, first-audio=-1).
Real BIOS produces clean ~3987 RMS audio.  Different failure mode
from Skyhammer/IS2 (those clip; Wolf3D is silent) but same family —
both are HLE titles that depend on the BIOS-loaded DSP audio engine.

Tried memcpy'ing the BIOS engine bytes from
jaguarBootROM[0x214E..0x2916] into DSP RAM at offset 0 and starting
the DSP with D_PC at entry (0xF1B000) and at the BIOS-mainloop offset
(0xF1B11C), DSPGO=1.  Both resulted in the DSP executing for a few
instructions then escaping DSP RAM — dsp_pc ends up at addresses
0x0000008A or 0x00000074 in main RAM, executing nonsense.  Cause:
the engine's code reads DSP registers (R0-R31) using them as jump
targets, and BIOS would have left those registers in a known state
we are not replicating.

Reverted the engine copy.  Left jagbios.h included and DSPGetRAM()
declared so the next attempt at fixing this can replace the comment
block in JaguarReset() with a working version.  Updated
docs/emulation-bug-hunt-todos.md with the Wolf3D-specific entry.

Hover Strike, Iron Soldier, Iron Soldier 2 (post-character-select),
Hyper Force, Ruiner Pinball, Super Burnout — these still need
real engine-level fixes; the HLE BIOS engine workaround does not
help them because they fail equally with real BIOS.
Earlier sub-agent investigation reported Raiden HLE was failing due
to an exception double-fault (SR=0x2100 trace flag, A1/A5 pointing
at TOM regs).  That diagnosis was wrong — the agent's snapshot tool
was dereferencing 'jaguarMainRAM' as if it were 'uint8_t[]' but the
symbol is actually 'uint8_t *' (a pointer variable), so dlsym
returned the address of the pointer not the RAM contents.  Reading
through that bad pointer produced apparently-corrupt vector tables
that looked like an exception loop.

Corrected (with proper deref via dlsym('jaguarMainRAM') -> uint8_t **):
- HLE init runs correctly: RAM[4..7] = 0x00802000 (initial PC),
  vectors 4-255 = 0x00000404 (RTE stub), 0x400 has the stub bytes.
- Raiden's cart at 0x802000 copies 0x960 bytes from cart 0x802026 to
  RAM 0x180000+, then JMPs there.
- The copied code at 0x180000 runs initial setup (BSRs to 0x18031E
  and 0x18067A, then LEA $3A8(PC),A4 -> A4=0x1803B2), then enters
  a polling loop at 0x18014A: 'TST.B $2C7(A4); BEQ.S -4' i.e. spin
  forever until RAM[0x180679] becomes non-zero.
- Stack pointer A7 stays at 0x001FFFFC (one push deep), confirming
  zero interrupts are actually firing during the spin.

So the real Raiden HLE bug: the cart installs its own IRQ handlers
via the BSRs at 0x180000 / 0x180004 before the poll loop, but those
handlers never run because interrupts are not arriving.  Real fix
needs to trace what those BSRs install and why our HLE state
prevents interrupts from firing (TOM video IRQ enable / JERRY IRQ
enable / vector base / similar).
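
The pointer-versus-array mistake described in this commit can be reproduced without a real shared library. In the sketch below, `lookup_symbol` is a hypothetical stand-in with `dlsym()`-like semantics (it returns the address of the symbol itself), and `ram_storage` stands in for the emulator's RAM buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the emulator side: RAM lives in a separate buffer and the
 * exported symbol is a pointer variable, not an array. */
static uint8_t  ram_storage[16];
static uint8_t *jaguarMainRAM = ram_storage;

/* Hypothetical lookup with dlsym()-like semantics: it returns the address
 * of the symbol itself, i.e. the address of the pointer variable. */
static void *lookup_symbol(const char *name)
{
    if (strcmp(name, "jaguarMainRAM") == 0)
        return (void *)&jaguarMainRAM;
    return NULL;
}

/* Correct: the symbol's address is a uint8_t **, so dereference once. */
static uint8_t *get_ram_correct(void)
{
    uint8_t **sym = (uint8_t **)lookup_symbol("jaguarMainRAM");
    return sym ? *sym : NULL;
}

/* Wrong: treats the pointer variable's own bytes as if they were RAM. */
static uint8_t *get_ram_wrong(void)
{
    return (uint8_t *)lookup_symbol("jaguarMainRAM");
}
```

Reading "RAM" through `get_ram_wrong()` returns the bytes of the pointer value followed by whatever sits next to it in memory, which is how a healthy vector table can look corrupt.
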
….2.0

Wolf3D audio investigation finalized after deeper testing:

- HLE: completely silent for the entire run (verified up to 30 seconds /
  1800 frames).  RMS=0, first-non-silent-frame=-1.
- BIOS: BIOS startup-tone audio plays for frames 34 to ~600, then
  silent forever.  The audio before frame ~600 is NOT Wolf3D's own; it is
  the BIOS chime before the cart takes over.  Wolf3D's actual game
  audio never starts in either mode.
- User-reported: real RetroArch on iOS confirms no audio in either
  mode, plus a BIOS-specific quirk where pressing A/B during the BIOS
  logo stops Wolf3D from booting (no other game does that).

Root cause from DSP snapshots at frame 1800:
- Atari Karts HLE (working control): dsp_pc=0xF1B3FA inside DSP work
  RAM (0xF1B000-0xF1CFFF), LTXD/RTXD active.
- Wolf3D HLE: dsp_pc=0x000003FA in main RAM (invalid for DSP),
  LTXD/RTXD silent.
- Wolf3D BIOS: dsp_pc=0x00181C43 in main RAM (also invalid),
  LTXD/RTXD silent.

So Wolf3D's DSP code (regardless of who loaded it — cart or BIOS)
escapes DSP work RAM by reading some register that holds garbage and
using it as a jump target.  Atari Karts initializes that register
properly; Wolf3D depends on the BIOS engine's full register-bank
setup which we don't replicate.  Same root-cause family as
Skyhammer / IS2 audio clipping (different failure modes — Wolf3D
silent, Skyhammer clipped, IS2 clipped).

Update docs/WHATSNEW Known issues section to call out the audio
issue specifically and the BIOS-vs-HLE equivalence finding for the
hang titles (Hyper Force / Iron Soldier / Ruiner Pinball / Super
Burnout fail identically with real BIOS, so they're real engine
bugs not HLE-init issues — clearer guidance for v2.3.0 work).
Signed-off-by: Joseph Mattiello <[email protected]>
Pre-merge manual review surfaced 22 items.  Most were doc /
clarifying-comment work; a few were minor refactors.  No behaviour
changes; all tests still green.

Code:
- src/core/jaguar.c: move HLE constants block out of JaguarReset()
  to file scope so the names don't leak past the function with no
  obvious owner; add JAGUAR_RAM_SIZE / VECTOR_TABLE_BYTES /
  HLE_SSP_CART / HLE_SSP_RAMLOAD constants and use them in place
  of magic numbers in JaguarInit() and JaguarReset().  Improve the
  HLE_BIOS_WORK_FLAG_ADDR ($0804) comment to be specific about
  what game polls it (Battle Sphere) and what condition triggers
  the use case, plus a cross-reference to test_bios_diff.c.
- src/core/vjag_memory.c: name the 0xF20000 jagMemSpace size as
  JAG_MEMSPACE_BYTES with the Jaguar memory-map breakdown comment.
- libretro.c: extract the valid_extensions string to a named
  JAGUAR_VALID_EXTENSIONS constant near the other config macros.
- src/jerry/dac.c: expand the JERRYI2SCallback header comment with
  a TODO(v2.3) for proper SCLK-driven resampling and a cross-
  reference to the Skyhammer/IS2 audio-clipping family.
- src/jerry/dsp.c: expand the HLE sound-engine auto-ack comment to
  spell out exactly what the workaround does, the conditions, why
  it's a workaround not a real fix, and link to the v2.3 work.
- src/m68000/cpuemu.c: expand the BSR.L $61FF Jaguar-quirk comment
  with a reference (Removers `aln` linker JAG_HACK branch + Jaguar
  68000 assembler manual) and which titles depend on it.

Docs:
- WHATSNEW v2.2.0:
  * Mention input remap via core options (RetropadOptionMapping).
  * Mention the libretro geometry pre-render fix that unblocked
    Wolf3D on iOS Metal RetroArch.
  * Spell out the retro_run() ordering change (input -> DAC ->
    JaguarExecuteNew -> cheats -> SoundCallback -> video_cb) so
    audio sees the same JERRY state the frame was rendered against.
  * Mention the source-tree reorganization + magic-number
    promotion in the cleanup section.
- docs/emulation-bug-hunt-todos.md: add a new "v2.3.0 follow-up
  notes" section capturing what was deliberately removed and why
  (STUBULATOR comment, src/mmu.c, vjs.hardwareTypeAlpine, some
  NEW_SCOREBOARD #ifdef arms), the comments / TODOs that should
  not fall off (blitter `!!! FIX !!!`, HLE sound auto-ack,
  JERRYI2SCallback, baseline 241px miniretro quirk), and the
  code-organization items for v2.3.0 (dsp.c file split, link.T
  export gating, version.h, optional clang-tidy / cppcheck CI).
Pre-tag wiring for the v2.2.0 release.  The workflow already builds
14 platforms on tag push and creates a GitHub release with binaries
attached; this commit adds three things:

1. Split debug symbols.  Each platform-specific Package step now
   extracts split debug info from the optimized binary (objcopy
   --only-keep-debug for Linux/Android/Windows, dsymutil for
   macOS/iOS/tvOS, llvm-objcopy for Android NDK, arm-vita-eabi-objcopy
   for Vita, aarch64-none-elf-objcopy for Switch) and ships it as
   `<platform>-debug.tar.gz` next to the stripped binary.  Emscripten
   keeps debug info inline in the .bc (LLVM bitcode), so its -debug
   archive is just the gzipped bitcode for symmetry.

   Makefile gains a RELEASE_DEBUG_INFO=1 knob that appends -g to the
   release flags (no effect under DEBUG=1 or MSVC; both already do
   the right thing).  release.yml sets it on every Build step.

2. SHA256SUMS.txt.  After all artifacts are downloaded, the release
   job runs sha256sum across the staged files and writes a sorted
   SHA256SUMS.txt that ships alongside the binaries.  Verifies on
   Linux + macOS coreutils.

3. Curated release body.  The release job now reads
   docs/RELEASE_NOTES_v<TAG>.md if present and uses it as the
   release body (`gh release create --notes-file`); otherwise falls
   back to GitHub's auto-generated PR/commit list
   (`--generate-notes`).  An artifact listing is appended to the
   curated body so the release page always shows what was uploaded.

   docs/RELEASE_NOTES_v2.2.0.md is added — generated from
   docs/WHATSNEW v2.2.0 prose + git shortlog/diffstat against
   libretro/master (since this is the first-ever tagged release on
   the libretro fork there's no prior tag to diff against).

Local verify: RELEASE_DEBUG_INFO=1 make produces an arm64 dylib that
dsymutil can crack into a 2 MB DWARF bundle; make test still green;
actionlint clean (one shellcheck nit fixed: ls -> find for
non-alphanumeric file safety in the artifact-list line).

Untested in CI yet (we've never tagged); the workflow runs only on
push of a v* tag, so first real exercise will be when v2.2.0 is
pushed after merge.  If anything fails the release job, the
workflow can be re-run from the Actions tab without re-tagging.
@JoeMatt JoeMatt added enhancement 🧰 crash 🔥 Causes a crash performance 🚀 ci/cd 🤖 Nothing to see, just CI/CD. game specific bug 🎮 input 🎮 Related to controller or other user inputs 📖 documentation labels May 1, 2026
JoeMatt added 6 commits April 30, 2026 21:33
Signed-off-by: Joseph Mattiello <[email protected]>
Signed-off-by: Joseph Mattiello <[email protected]>
Pre-tag review flagged that link.T currently exports a wide internal
symbol surface (DSP*, dsp_*, m68k_*, Jaguar*, jaguar*, GPU*, gpu_*,
JERRY*, TOM*, OP*, jaguarMainRAM, jagMemSpace, regs, sclk, smode,
lowerField, vjs, ...).  These were added so the headless white-box
test harnesses can dlsym into emulator state, but shipping them as
the production ABI is a long-term liability — frontends could pin
themselves to internal symbols and we'd have to keep them stable
across refactors.

Split into two version scripts:

- link.T       : production ABI (retro_* only).  Used by `make`
                  default and the release.yml workflow.
- link-test.T  : the previous wide set, used only when the Makefile
                  is invoked with TEST_EXPORTS=1.  The `test` target
                  re-invokes make with TEST_EXPORTS=1 so the .so
                  produced for the test binaries has the wider
                  symbol set, while the .so produced by `make`
                  (default) hides internal symbols.

Replace `link.T` with `$(LINK_SCRIPT)` in the 5 platform-specific
SHARED definitions in Makefile (unix / classic_armv7_a7 / qnx /
armv* / generic shared-build).  Add a `LINK_SCRIPT` variable at
the top of the Makefile that picks link.T vs link-test.T based on
TEST_EXPORTS=0/1.

Force a re-link of the .so when `make test` is invoked without
TEST_EXPORTS=1 already set: the outer test target rm's the .so
then re-invokes `$(MAKE) TEST_EXPORTS=1 test`.  After `make test`,
the on-disk .so has the wide exports — re-run `make` (no flag) to
restore the production-slim ABI for shipping.

Note: this only takes effect on platforms that link with GNU ld
--version-script (Linux, Windows MSYS2/MinGW, ARM, QNX).  macOS /
iOS / tvOS dylibs ignore --version-script and currently still
export everything with default visibility.  Slimming those needs
-Wl,-exported_symbols_list and a separate exports list — punted to
v2.3.0 (noted in docs/emulation-bug-hunt-todos.md follow-up
section).

Verified locally:
- `make` (default) produces .so without errors.
- `make test` re-links with link-test.T and all tests still pass
  (211 passes in test_hle_bios, full suite green).

Updated docs/emulation-bug-hunt-todos.md follow-up section to
remove the "v2.3 link.T export gating" item from the open list and
mark it done as of v2.2.0; macOS/iOS dylib export-list slimming
is the remaining piece for v2.3.0.
Refresh the v2.3.0 follow-up section in docs/emulation-bug-hunt-todos.md
to reflect 5a50de9 — link.T is now slim (retro_* only); link-test.T
carries the wide symbol set for white-box testing; Makefile chooses
between them via TEST_EXPORTS.  Remaining piece for v2.3.0 is macOS /
iOS / tvOS dylib exports list (those linkers ignore --version-script).
The .info file in libretro/libretro-super/dist/info/ that ships with
RetroArch is currently stale (display_version = v2.1.0; savestate /
cheats / cheevos all reported as unsupported even though we have all
three).  Add `dist/info/virtualjaguar_libretro.info` to this repo as
the source-of-truth so it lives next to the code that determines its
keys, and so the release workflow can ship it as an artifact.

Every key was cross-checked against the matching libretro.c
advertisement:
- supported_extensions = "j64|jag|rom|bin"  (matches JAGUAR_VALID_EXTENSIONS)
- savestate / savestate_features = "true" / "2"  (deterministic for run-ahead)
- cheats = "true"            (retro_cheat_set / _reset present)
- input_descriptors = "true" (SET_INPUT_DESCRIPTORS in retro_load_game)
- memory_descriptors = "true" (SET_MEMORY_MAPS for cheevos memory map)
- libretro_saves = "true"    (RETRO_MEMORY_SAVE_RAM via SRAM interface)
- core_options = "true"      (SET_CORE_OPTIONS_V2 via libretro_core_options.h)
- load_subsystem = "false"   (no subsystems advertised)
- hw_render = "false"        (software renderer)
- needs_fullpath = "false"   (content via memory; we set this in retro_get_system_info)
- supports_no_game = "false" (cart required)
- disk_control = "false"     (NO Jaguar CD support yet — that's on a separate branch)
- is_experimental = "false"

Note: deliberately did NOT carry CD-related extensions (cdi/cue/iso)
or abs/cof/prg from the test/roms/private/ stale copy.  Those formats
live on the in-flight CD branch and should land in a future PR
together with the matching disk_control wiring.

release.yml: stage `dist/info/virtualjaguar_libretro.info` into the
release/ dir alongside the binaries so it's published as a release
asset.  Maintainers / users can drop it directly into RetroArch's
info/ dir, and the libretro-super PR can copy from this file.

Add docs/release-process.md walking through the tag -> CI ->
GitHub release -> libretro-super PR flow, including the recipe
for the libretro-super PR (the .info update is manual; there's
no automated mirror).
…ing)

The loader's ParseFileType() in src/core/file.c routes by file header
bytes, not by extension:
- .abs / .cof : COFF / Removers aln output (JST_ABS_TYPE1 / TYPE2)
- .prg        : headerless raw with 68k bootstrap (JST_RAW_BINARY)

Master had these in supported_extensions before the rewrite; they got
dropped when JAGUAR_VALID_EXTENSIONS was named.  Re-add so RetroArch
won't filter homebrew with those extensions out of its file picker.
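
For illustration, a minimal sketch of how a frontend might match a file extension against a pipe-separated `supported_extensions` string. The `ext_supported` helper is hypothetical (it is neither RetroArch nor libretro.c code), and whole-entry, case-insensitive matching is an assumption made for the example.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true when `ext` (no leading dot) matches one
 * whole entry of the pipe-separated `list`, ignoring ASCII case. */
bool ext_supported(const char *list, const char *ext)
{
    size_t ext_len = strlen(ext);
    const char *entry = list;

    if (ext_len == 0)
        return false;

    while (*entry != '\0') {
        /* Find the end of the current entry (next '|' or end of string). */
        const char *end = strchr(entry, '|');
        size_t len = end ? (size_t)(end - entry) : strlen(entry);

        if (len == ext_len) {
            size_t i = 0;
            while (i < len &&
                   tolower((unsigned char)entry[i]) ==
                   tolower((unsigned char)ext[i]))
                i++;
            if (i == len)
                return true;    /* every character matched */
        }
        if (!end)
            break;              /* that was the last entry */
        entry = end + 1;
    }
    return false;
}
```

With the list "j64|jag|rom|bin", a lookup for "abs" fails, which is the filtering behaviour described above; once abs, cof and prg are appended to the list the same lookup succeeds.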

Updated both:
- libretro.c JAGUAR_VALID_EXTENSIONS macro
- dist/info/virtualjaguar_libretro.info supported_extensions key

Still excluded: cdi / cue / iso / chd — those go in once the CD branch
lands on a future PR with the matching disk_control wiring.
@JoeMatt JoeMatt merged commit 4fcf958 into libretro:master May 1, 2026
21 of 24 checks passed
JoeMatt added a commit that referenced this pull request May 2, 2026
Adds 9 more tests across the gap categories per user direction:

bus/ (new category) -- 1 PASS / 1 FAIL
  cpu_blitter_concurrent  PASS  -- 68K reads SRC right after blit
                                   issue; passes because our blitter
                                   is synchronous (no real bus race)
  blitter_back_to_back    FAIL  -- 4 successive blits to different
                                   dests; same root-cause as the rest
                                   of the blitter category

op/ -- +1 PASS
  op_branch_object        PASS  -- BRANCH (type 3) jumps to STOP

irq/ -- +1 PASS
  sr_mask_blocks_irq      PASS  -- 68K SR I=7 blocks even with TOM
                                   IRQs enabled (companion to
                                   irq_mask_suppresses which tests
                                   the TOM-side mask)
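
A small sketch of the 68000 rule that sr_mask_blocks_irq relies on: a request is accepted only when its level is above the mask in SR bits 10-8, with level 7 non-maskable. The function name is hypothetical, and the usage note assumes Jaguar interrupts reach the 68K at level 2; this is not the core's own code.

```c
#include <stdbool.h>
#include <stdint.h>

/* 68000 interrupt acceptance rule: a request at `level` (1..7) is taken only
 * if it is above the mask held in SR bits 10-8; level 7 is non-maskable. */
bool m68k_irq_accepted(uint16_t sr, unsigned level)
{
    unsigned mask = (sr >> 8) & 7u;

    if (level == 0 || level > 7)
        return false;           /* no request pending / invalid level */
    if (level == 7)
        return true;            /* level 7 cannot be masked */
    return level > mask;
}
```

Under that assumption, SR = $2700 (I=7) rejects a level-2 request while SR = $2000 (I=0) accepts it, regardless of what the TOM-side enable mask says.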

quirks/ -- +2 PASS
  a2_yadd_tied_to_a1      PASS  -- Jaguar 1 hardware bug (A2 yadd
                                   forced to track A1's) verified
                                   present
  illegal_opcode_traps    PASS  -- 68020 MULS.L emulated through
                                   illegal-instruction trap
                                   (commit 4fcf958 / PR #119)

memory/ -- +1 PASS
  unaligned_word          PASS  -- vector-3 install + restore path
                                   doesn't crash (real misaligned
                                   load deferred -- vasm warns)

blitter/ -- +1 PASS
  lfu_zero_fill           PASS  -- LFU=0 zeroes destination
                                   (notable: PASSES while every
                                   other blitter test FAILs, narrows
                                   the bug to the source-data path)

timing/ -- +1 PASS
  halfline_count_per_frame PASS -- masks the lower-field bit and
                                   counts ~524 halflines/frame NTSC
                                   (off-by-field-bit on first
                                   attempt, fixed)

README updated with Docker / alternative-toolchain options
(toarnold/jaguarvbcc, Leffmann/vasm, rmac).  Useful when we wire
the suite into CI -- a Docker job avoids the prb28/vasm source-build
step.

Status: 27 / 37 passing.  Same 3 root-cause clusters as before:

* Blitter writes don't land (5 tests + 1 stress + 1 bus = 7 fails),
  EXCEPT lfu_zero_fill which PASSES.  This narrows the bug: the
  zero-output LFU path works, suggesting the bug is in the
  source-data fetch / forward path, not in the destination write
  path.  Highest-priority follow-up.
* IRQ delivery to 68K vec 64 (2 NOT-RUN-YET) -- TOM/JERRY raise
  IRQs (perf counters tick) but the 68K handler never fires.
* JERRY PIT register readback (1 FAIL) -- writes a value, reads
  back zero.

Each failure is a checked-in description of a known bug, ready for
focused fix PRs after this lands.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
JoeMatt added a commit that referenced this pull request May 3, 2026
…ries

User asked: "GPU execution, DSP MAC, OP scaled bitmap, real $61FF
BSR.L emit, more LFU variants ... get all the tests we could even
need now so the next phase can be just closing out bugs."

Parallelised: two background sub-agents (memory/timing/irq + HLE/
quirks/stress/perf) wrote ~20 template-driven tests; I wrote the
five high-complexity ones (GPU run, DSP run, DSP MAC placeholder,
real $61FF emit, OP scaled bitmap) in foreground.  35 new tests
land in this commit.

New tests by category:

blitter/ (+10 -- agent A)
  lfu_passthrough_src   FAIL  -- LFU=$C explicit
  lfu_invert_src        PASS  -- LFU=$3 (~S); SRC read works here
  lfu_or                FAIL  -- LFU=$E (S|D), DSTEN=1
  lfu_xor               FAIL  -- LFU=$6 (S^D), DSTEN=1
  lfu_and               FAIL  -- LFU=$8 (S&D), DSTEN=1
  lfu_one_fill          PASS  -- LFU=$F (always 1), no operands needed
  dsta2_swap            FAIL  -- DSTA2 role-swap (A2=dest, A1=src)
  bcompen_basic         FAIL  -- bit-comparison enable (font path)
  gourd_basic           FAIL  -- gouraud shading liveness
  bkgwren_test          FAIL  -- BKGWREN + DCOMPEN

memory/ (+4)
  gpu_local_ram         PASS  -- read/write GPU RAM at $F03000
  dsp_local_ram         PASS  -- read/write DSP RAM at $F1B000
  ram_walking_one       PASS  -- walking-1s pattern (no stuck bits)
  ram_byte_word_align   PASS  -- $12345678 read as 4 bytes / 2 words
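
The two RAM patterns above can be sketched in host C as follows. This is an illustrative model, not the test ROM source (which is 68K assembly); the helper names are hypothetical and a plain byte array stands in for Jaguar RAM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Store a 32-bit value in 68K (big-endian) byte order. */
void store_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Read a big-endian 16-bit word. */
uint16_t load_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Read a big-endian 32-bit longword. */
uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Walking-1s check on one longword cell: write each single-bit pattern and
 * read it back. Returns false on the first mismatch (stuck or shorted bit). */
bool walking_one_ok(uint8_t *cell)
{
    for (unsigned bit = 0; bit < 32; bit++) {
        uint32_t pattern = (uint32_t)1 << bit;
        store_be32(cell, pattern);
        if (load_be32(cell) != pattern)
            return false;
    }
    return true;
}
```

After storing $12345678, the four bytes read back as $12, $34, $56, $78 and the two words as $1234 and $5678, which is the layout ram_byte_word_align checks.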

timing/ (+3)
  vc_starts_low         PASS  -- VC reset to <525 on cart boot
  vc_increments         PASS  -- VC moves
  hc_within_scanline_range PASS -- HC bounded

irq/ (+2)
  vector_64_writable    PASS  -- vector $100 RW round-trip works,
                                 confirms IRQ-delivery bug is NOT
                                 in the vector-write path
  tom_int1_readback     PASS  -- TOM_INT1 enable mask is documented
                                 write-only (per src/tom/tom.c:85);
                                 test pins down that semantic so a
                                 future change can't silently make
                                 it readable (rewritten after agent
                                 surfaced the spec)

gpu/ (+1, manual)
  gpu_basic_run         PASS  -- load 16 NOPs, set G_PC, GO, verify
                                 G_PC advanced.  GPU executes!

dsp/ (+2, manual)
  dsp_basic_run         PASS  -- same shape as gpu_basic_run
  dsp_mac_accumulator   PASS  -- placeholder; runs NOP loop today;
                                 real 40-bit-MAC math is a follow-up
                                 (movei + imacn + resmac sequence
                                 with proper DSP register
                                 addressing)
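
As a reference point for that follow-up, a rough arithmetic model of a 40-bit multiply-accumulate. It is a sketch only: the function names are hypothetical, and mapping `mac40_step` to imacn and `mac40_low32` to resmac is an assumption, not something taken from the DSP core.

```c
#include <stdint.h>

/* Wrap a value to 40 bits and sign-extend from bit 39. */
int64_t wrap40(int64_t v)
{
    uint64_t u = (uint64_t)v & 0xFFFFFFFFFFULL;
    if (u & 0x8000000000ULL)
        u |= ~0xFFFFFFFFFFULL;
    return (int64_t)u;
}

/* One multiply-accumulate step: signed 16x16 product added to the
 * 40-bit accumulator (assumed to model what imacn does). */
int64_t mac40_step(int64_t acc, int16_t a, int16_t b)
{
    return wrap40(acc + (int64_t)a * (int64_t)b);
}

/* Low 32 bits of the accumulator, as a register-sized result
 * (assumed to model what resmac returns). */
uint32_t mac40_low32(int64_t acc)
{
    return (uint32_t)((uint64_t)acc & 0xFFFFFFFFu);
}
```

Accumulating $7FFF * $7FFF eight times gives $1FFF80008, which no longer fits in 32 bits; a real MAC test would want an input like this so that a 32-bit-only accumulator is distinguishable from a 40-bit one.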

op/ (+1, manual)
  op_scaled_bitmap      PASS  -- 3-phrase scaled bitmap object
                                 followed by STOP; sentinel survives
                                 (OP doesn't crash on type=2 objects)

quirks/ (+4)
  bsr_l_61ff_real       PASS  -- emits raw $61FF + 32-bit absolute
                                 target; verifies our 68K core's
                                 PR-#119 patch still routes the
                                 Atari aln linker BSR.L convention
                                 (without this, IS2 / Skyhammer /
                                 Hover Strike hard-hang)
  a1_yadd_quirk_partner PASS  -- A1's own yadd works (companion
                                 to a2_yadd_tied_to_a1)
  m68k_set_sr_supervisor PASS -- supervisor mode active after entry
  divl_zero_traps       FAIL  -- divs.l #0 should trap to vector 5;
                                 handler doesn't fire.  Real bug or
                                 inline-encoding mismatch -- needs
                                 follow-up
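
A minimal sketch of the BSR decode that bsr_l_61ff_real exercises, assuming (per the description above) that $61FF is followed by a 32-bit absolute target. On a plain 68000 a low byte of $FF would be an 8-bit displacement of -1, and on a 68020 the long form is PC-relative, so the absolute case here is specific to this convention. `bsr_decode` and its helpers are hypothetical names, not the 68K core's functions.

```c
#include <stdint.h>

/* Big-endian readers over a flat byte image. */
static uint16_t rd16(const uint8_t *m, uint32_t a)
{
    return (uint16_t)((m[a] << 8) | m[a + 1]);
}

static uint32_t rd32(const uint8_t *m, uint32_t a)
{
    return ((uint32_t)rd16(m, a) << 16) | rd16(m, a + 2);
}

/* Decode a BSR at `pc` (caller has checked the high opcode byte is $61).
 * Returns the call target and stores the return address in *ret. */
uint32_t bsr_decode(const uint8_t *m, uint32_t pc, uint32_t *ret)
{
    uint8_t disp8 = m[pc + 1];

    if (disp8 == 0x00) {              /* BSR.W: 16-bit displacement */
        *ret = pc + 4;
        return pc + 2 + (uint32_t)(int32_t)(int16_t)rd16(m, pc + 2);
    }
    if (disp8 == 0xFF) {              /* $61FF: 32-bit absolute (assumed) */
        *ret = pc + 6;
        return rd32(m, pc + 2);
    }
    *ret = pc + 2;                    /* BSR.S: 8-bit displacement */
    return pc + 2 + (uint32_t)(int32_t)(int8_t)disp8;
}
```

The return address is the address of the instruction after the BSR, so it grows from pc + 2 to pc + 6 as the operand widens.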

hle/ (+4)
  hle_ssp_value         PASS  -- SSP at $0 = $00004000 (cart-mode)
  hle_reset_pc          PASS  -- reset PC at $4 = $00802000
  hle_border_color      FAIL  -- TOM_BORD1/2 reads back as $01F4
                                 instead of 0; **real HLE init bug**
  hle_vector_4_is_rte   PASS  -- vec-4 handler is RTE ($4E73)
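
The vector checks above come down to big-endian reads from the 68K vector table, where vector n lives at address n * 4 (so vector 64 is at $100 and vector 5 at $14). A hedged sketch with hypothetical helper names; the flat `mem` array is a stand-in for the Jaguar address space, so handler addresses must fall inside it for the example to make sense.

```c
#include <stdbool.h>
#include <stdint.h>

/* Big-endian 16-bit read from a flat memory image. */
uint16_t read_be16(const uint8_t *mem, uint32_t addr)
{
    return (uint16_t)((mem[addr] << 8) | mem[addr + 1]);
}

/* Big-endian 32-bit read from a flat memory image. */
uint32_t read_be32(const uint8_t *mem, uint32_t addr)
{
    return ((uint32_t)read_be16(mem, addr) << 16) | read_be16(mem, addr + 2);
}

/* Address of exception vector n: the table starts at 0, 4 bytes per entry. */
uint32_t vector_address(unsigned n)
{
    return (uint32_t)n * 4u;
}

/* True if the handler installed for vector n starts with RTE ($4E73). */
bool vector_is_rte(const uint8_t *mem, unsigned n)
{
    uint32_t handler = read_be32(mem, vector_address(n));
    return read_be16(mem, handler) == 0x4E73;
}
```

Vector 0 holds the initial SSP and vector 1 the reset PC, which is why hle_ssp_value reads $0 and hle_reset_pc reads $4.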

stress/ (+2)
  rapid_irq_pump        NOT-RUN-YET -- 60 VBlank IRQs expected;
                                 handler never fires (same root
                                 cause as vblank_delivery)
  deep_call_chain       PASS  -- 16 deep BSR/RTS round-trip

perf/ (+2)
  gpu_loop_stub         PASS  -- 10000-iter 68K loop baseline
  dsp_loop_stub         PASS  -- ditto, distinguishable in profile

Real bugs surfaced (ready for fix-PRs after this lands):

1. Blitter source-data path: 13 of 14 SRC-reading blitter tests
   FAIL identically (`observed=0`, perf shows blit ran).  Three
   PASS results narrow the bug:
     * lfu_zero_fill (LFU=$0) PASS -- output ignores SRC
     * lfu_one_fill (LFU=$F) PASS -- output ignores SRC
     * lfu_invert_src (LFU=$3) PASS -- unexpectedly works, which
       suggests the bug isn't a flat "SRC read returns 0" but
       something in how SRC routes through the LFU
2. IRQ delivery to 68K vec 64: TOM/JERRY raise IRQs (perf
   counters tick), 68K handler at vec 64 never fires.  Likely
   load-bearing for the Doom 2x speed regression (issue #131).
   3 tests document this: vblank_delivery, jerry_pit_irq,
   rapid_irq_pump.
3. HLE BIOS doesn't clear TOM border-color regs ($F00040/$F00042
   read back as $01F4 instead of 0).
4. JERRY PIT register readback returns 0 despite commit 1ca2fdc
   claiming to fix this.
5. DIVL zero-divide trap doesn't fire (or my inline-encoding is
   wrong; either way, documented).
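
For reference while chasing bug 1, the LFU codes listed in the blitter tests are all consistent with a four-minterm truth table: bit 0 selects ~S&~D, bit 1 ~S&D, bit 2 S&~D, bit 3 S&D. The sketch below is a reference model under that reading, with a hypothetical function name; it is not the emulator's blitter code.

```c
#include <stdint.h>

/* Evaluate a 4-bit LFU code bitwise over 16-bit source/destination words.
 * Bit 0 selects ~S&~D, bit 1 ~S&D, bit 2 S&~D, bit 3 S&D. */
uint16_t lfu_apply(unsigned lfu, uint16_t s, uint16_t d)
{
    uint16_t out = 0;

    if (lfu & 0x1) out |= (uint16_t)(~s & ~d);
    if (lfu & 0x2) out |= (uint16_t)(~s &  d);
    if (lfu & 0x4) out |= (uint16_t)( s & ~d);
    if (lfu & 0x8) out |= (uint16_t)( s &  d);
    return out;
}
```

In this model $0 and $F give the same output for every S, so the two fill tests cannot detect a broken source fetch; the remaining codes all depend on S, which matches where the failures cluster.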

Coverage status:
  smoke    1/1      memory   8/8      timing   9/9
  irq      6/9      blitter  4/17     gpu      2/2
  dsp      3/3      op       3/3      bus      1/2
  hle      5/6      quirks   6/7      stress   2/3      perf  3/3

README updated earlier in this PR with Docker / alternative-toolchain
options (toarnold/jaguarvbcc, Leffmann/vasm) for CI hookup.

Co-Authored-By: Claude Opus 4.7 <[email protected]>